Abstract


In Johnson et al. (Commun. Statist. Theor. Meth. A9(9), 917-922) and
Johnson and Kotz (Proc. ONR/ARO Reliability Workshop, April 1981), the authors
previously derived the distribution of the number of items observed to be
defective in samples from a finite population, when false identification of
defectives as well as incomplete identification is taken into account. The
corresponding distributions of waiting times until a specified number of
defective items is observed were also obtained. In the present paper, we extend
some of these results to the case of "group screening" sampling schemes.

Key Words and Phrases: group screening; binomial distribution; compound
distributions; faulty identification; hypergeometric distribution; sampling
inspection; waiting time; incomplete identification.
1. Introduction and classification of faulty hypergeometric


In recent papers (Johnson et al. (1980) and Johnson and Kotz (1981a)), the
authors developed several models for incomplete and false identification
distributions, originally motivated by applications in auditing (Sorkin (1977))
and in quality control. These models can be viewed as a new variant of the
damage models introduced by Rao and Rubin (1964), which have been extensively
studied in the literature. (See Johnson and Kotz (1981b) for a survey of damage
models and their relation to faulty inspection models.)

For completeness and readers' convenience, we shall briefly describe the
main results offered in Johnson et al. (1980) and Johnson and Kotz (1981a).

1a) Incomplete identification.

Consider a sample of size n drawn without replacement from a lot of size N
containing X defective (or nonconforming) items, when inspection detects such
items with probability p (0 < p ≤ 1). It is assumed that no "correct" items
are classified as defectives. In this model, the overall distribution of the
total number of identified defectives, Z say, is found to be the compound
binomial distribution

$$\text{Binomial}(Y, p) \bigwedge_{Y} \text{Hypergeometric}(n, X, N),$$

where $\bigwedge$ denotes the compounding operation (Johnson and Kotz (1969,
p. 184)) and Y denotes the actual (unobservable) number of defective items in a
random sample (without replacement) of size n.

The formula for the s-th descending factorial moment of Z is

$$\mu_{(s)}(Z) = E(Z^{(s)}) = p^{s}\, n^{(s)} X^{(s)} / N^{(s)}, \tag{1}$$

where $x^{(s)} = x(x-1)\cdots(x-s+1)$.
In particular,

$$E(Z) = pnX/N, \tag{2}$$

which is the mean of a hypergeometric distribution with parameters (n, pX, N),
formally representing the distribution of defectives in a sample (without
replacement) of size n from a population of size N containing pX defectives.
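The variance V(Z) can be written as

$$\mathrm{Var}(Z) = p^{2}\,\frac{N-n}{N-1}\,n\,\frac{X}{N}\Bigl(1-\frac{X}{N}\Bigr) + p(1-p)\frac{nX}{N} = p^{2}\,\mathrm{Var}(Z \mid p=1) + p(1-p)\frac{nX}{N}, \tag{3.1}$$

or alternatively in the two following forms:

$$\mathrm{Var}(Z) = \frac{N-n}{N-1}\,\frac{npX}{N}\Bigl(1-\frac{pX}{N}\Bigr) + \frac{n-1}{N-1}\,p(1-p)\,\frac{nX}{N} \tag{3.2}$$

or

$$\mathrm{Var}(Z) = \frac{npX}{N}\Bigl(1-\frac{pX}{N}\Bigr) - \frac{n(n-1)}{N(N-1)}\,p^{2}X\Bigl(1-\frac{X}{N}\Bigr). \tag{3.3}$$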

These show that V(Z) is not less than the variance of the hypergeometric with
parameters (n, pX, N), but cannot exceed the variance of a binomial with
parameters (n, pX/N).
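The bounds on V(Z) can be checked numerically. The following is a minimal sketch, not taken from the original papers: it builds the compound distribution of case 1a by direct enumeration with exact rational arithmetic and compares its mean and variance with (2) and (3.1). The function names and the parameter values (n = 10, X = 20, N = 100, p = 0.9) are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(y, n, X, N):
    """P(Y = y): y defectives in a sample of n drawn without replacement."""
    return Fraction(comb(X, y) * comb(N - X, n - y), comb(N, n))

def incomplete_id_pmf(n, X, N, p):
    """pmf of Z, where Z | Y ~ Binomial(Y, p) and Y ~ Hypergeometric(n, X, N)."""
    pmf = [Fraction(0)] * (n + 1)
    for y in range(max(0, n - (N - X)), min(n, X) + 1):
        w = hypergeom_pmf(y, n, X, N)
        for z in range(y + 1):
            pmf[z] += w * comb(y, z) * p**z * (1 - p)**(y - z)
    return pmf

# Illustrative (assumed) parameter values.
n, X, N, p = 10, 20, 100, Fraction(9, 10)
pmf = incomplete_id_pmf(n, X, N, p)
mean = sum(z * q for z, q in enumerate(pmf))
var = sum(z * z * q for z, q in enumerate(pmf)) - mean**2

theta = Fraction(X, N)
# Right-hand side of (3.1).
var_31 = (p**2 * Fraction(N - n, N - 1) * n * theta * (1 - theta)
          + p * (1 - p) * n * theta)
# Variances of Hypergeometric(n, pX, N) and Binomial(n, pX/N).
var_hyper = Fraction(N - n, N - 1) * n * p * theta * (1 - p * theta)
var_binom = n * p * theta * (1 - p * theta)

print(float(mean), float(var_hyper), float(var), float(var_binom))
```

Exact fractions are used so that the comparison with the closed forms is not affected by rounding.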

The corresponding waiting time distribution of the number M of drawings
(without replacement) of items needed to produce a (≤ X) defective items
recognized as such (made proper by normalizing, as P(M = m)/P(M ≤ N)) is the
compound distribution

$$\text{Negative Hypergeometric}(Y, X, N) \bigwedge_{Y} \text{Truncated } (Y \le X) \text{ Negative Binomial}(a,\, p^{-1} - 1).$$

(The negative binomial is truncated from above at Y = X because there are no
more than X defective items.)




It seems to be difficult to obtain exact expressions for the moments of M.
However, if the truncation to values Y ≤ X is neglected, the s-th ascending
factorial moment of M is given by

$$E(M^{[s]}) = p^{-s}\, a^{[s]} (N+1)^{[s]} / (X+1)^{[s]}, \tag{4}$$

where $M^{[s]} = M(M+1)\cdots(M+s-1)$. In particular,

$$E[M] = \frac{a(N+1)}{p(X+1)} \tag{5.1}$$

and

$$\mathrm{Var}(M) = \frac{a(N+1)}{p^{2}(X+1)^{2}(X+2)}\bigl\{(N+2)(X+1) - p(X+1)(X+2) - a(N-X)\bigr\}. \tag{5.2}$$

(See  Johnson  et  al.  (1980)  for  more  details.) 
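As a consistency check on the formulas for the moments of M, the short sketch below evaluates the ascending factorial moments (4) for s = 1, 2 and confirms that Var(M) = E(M^{[2]}) - E(M) - {E(M)}^2 agrees with (5.2). The helper names and the parameter values are assumptions chosen for illustration; as in the text, truncation is neglected.

```python
from fractions import Fraction

def rising(x, s):
    """Ascending factorial x^{[s]} = x (x+1) ... (x+s-1)."""
    out = Fraction(1)
    for i in range(s):
        out *= x + i
    return out

def ascending_moment(s, a, X, N, p):
    """Formula (4): E(M^{[s]}) with the truncation Y <= X neglected."""
    return rising(a, s) * rising(N + 1, s) / (p**s * rising(X + 1, s))

# Illustrative (assumed) parameter values.
a, X, N, p = 3, 20, 100, Fraction(9, 10)

m1 = ascending_moment(1, a, X, N, p)   # E(M)
m2 = ascending_moment(2, a, X, N, p)   # E(M(M+1))
var_from_moments = m2 - m1 - m1**2

# Right-hand sides of (5.1) and (5.2).
mean_51 = Fraction(a * (N + 1)) / (p * (X + 1))
var_52 = (Fraction(a * (N + 1)) / (p**2 * (X + 1)**2 * (X + 2))
          * ((N + 2) * (X + 1) - p * (X + 1) * (X + 2) - a * (N - X)))

print(float(m1), float(var_from_moments))
```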
1b) False and incomplete identification.

In Johnson and Kotz (1981a), the model described in (a) was extended by
allowing for a probability, p', of erroneously deciding that an item is
defective when really it is not. (In the purely incomplete model, p' = 0.)

In this case, the overall distribution of the total number of items called
"defectives", Z, is the compound distribution

$$\text{Binomial}(Y, p) + \text{Binomial}(n-Y,\, p') \bigwedge_{Y} \text{Hypergeometric}(n, X, N)$$

(the two binomial variables are mutually independent).

The r-th descending factorial moment of Z is in this case given by

$$\mu_{(r)}(Z) = \frac{n^{(r)}}{N^{(r)}} \sum_{j=0}^{r} \binom{r}{j} p^{j} p'^{\,r-j} X^{(j)} (N-X)^{(r-j)}; \tag{6}$$

in particular,

$$E(Z) = n\bar{p},$$

where $\bar{p} = \{Xp + (N-X)p'\}/N$, and the variance is

$$\mathrm{Var}(Z) = n\bar{p}(1-\bar{p}) - \frac{n(n-1)}{N-1}\,\frac{X}{N}\Bigl(1-\frac{X}{N}\Bigr)(p-p')^{2}$$

(cf. the corresponding expression for the variance of Z in case 1a.) Tables
of the distribution of Z for p = .75(.05).95; p' = 0(.025).1; N = 100,
X = 5, 10, 20; N = 200, X = 10, 20, 40; and n = 10 are presented in Johnson and
Kotz (1981a). More detailed tables may be obtained by writing to S. Kotz.
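Distributions of the kind tabulated there can be recomputed by direct enumeration. The sketch below is an illustration only and is not the procedure used for the published tables: it evaluates the distribution of Z for case 1b and checks the mean and variance expressions stated above for this case. The function names and the chosen values (N = 100, X = 10, n = 10, p = 0.9, p' = 0.05) are assumptions.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, m, q):
    """P(K = k) for K ~ Binomial(m, q)."""
    return comb(m, k) * q**k * (1 - q)**(m - k)

def false_id_pmf(n, X, N, p, p1):
    """pmf of Z = Binomial(Y, p) + Binomial(n - Y, p1), Y ~ Hypergeometric(n, X, N).

    p  : probability that a defective item is called defective
    p1 : probability that a non-defective item is called defective (p' in the text)
    """
    pmf = [Fraction(0)] * (n + 1)
    for y in range(max(0, n - (N - X)), min(n, X) + 1):
        w = Fraction(comb(X, y) * comb(N - X, n - y), comb(N, n))
        for j in range(y + 1):              # true defectives detected
            for k in range(n - y + 1):      # good items wrongly flagged
                pmf[j + k] += w * binom_pmf(j, y, p) * binom_pmf(k, n - y, p1)
    return pmf

# Illustrative (assumed) parameter values.
n, X, N = 10, 10, 100
p, p1 = Fraction(9, 10), Fraction(1, 20)

pmf = false_id_pmf(n, X, N, p, p1)
mean = sum(z * q for z, q in enumerate(pmf))
var = sum(z * z * q for z, q in enumerate(pmf)) - mean**2

pbar = (X * p + (N - X) * p1) / N
theta = Fraction(X, N)
var_formula = (n * pbar * (1 - pbar)
               - Fraction(n * (n - 1), N - 1) * theta * (1 - theta) * (p - p1)**2)

for z, q in enumerate(pmf):
    print(z, round(float(q), 5))
```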

The distributions are quite sensitive to the values of p', but not to the
values of the ratio n/N. In fact, as N and X are increased proportionately to
each other with X/N held fixed, the other parameters (n, p, p') remaining
constant, the distribution of Z tends to a binomial with parameters
n, $XN^{-1}p + (1 - XN^{-1})p'$. The waiting time distribution (i.e. the
distribution of the number of items M, say, needed to be inspected one at a time
until a predetermined number a of items have been assessed as "defective")
seems to be difficult to derive. Using a conditioning argument, Johnson and Kotz
(1981a) obtained close approximations and bounds on the values of E(M) and
Var(M) in this case.

These  are 


E(M)  «  ap_1[l  ♦  “II  ♦  *  -  -^2  ftp2  ♦  (N-X)p,2)l  (7.1) 

**  Np 

and


J  •  &J  i  Var(M)  a  iikll(l+  I)  ,  (7.2) 
As N → ∞, E(M) approaches $a\bar{p}^{-1}$ and the variance tends to
$a\bar{p}^{-2}(1-\bar{p})$. These are the mean and variance, respectively, of
the (negative binomial) waiting time distribution for occurrence of a
"successes" in independent trials with probability of success equal to
$\bar{p}$ at each trial.

These results can easily be generalized to the case of stratified
populations where the lot is divided into k strata of sizes
$X_1, X_2, \ldots, X_k$ ($\sum_{j=1}^{k} X_j = N$) such that for any chosen
individual in the j-th stratum, the probability of "detection as defective"
(whether this is really so or not) is $p_j$. The case considered above
corresponds to k = 2, $p_1 = p$, and $p_2 = p'$. See Johnson and Kotz (1981a)
for more details.

2.  Group  screening  model  involving  incomplete  and  false  identification. 

Further  interesting  distributions  arise  in  connection  with  "group 
screening"  (Dorfman  (1943)),  in  which  groups  of  units  can  be  tested  for  the 
existence  of  one  or  more  defective  units  among  them.  This  can  be  practicable, 
for  example,  when  testing  liquids  for  presence  of  contaminants,  and  is  then 
suggested  as  a  possible  way  of  reducing  the  average  total  amount  of  testing. 

Suppose  that  material  from  n  units  is  mixed  and  tested  for  presence  of 
"defective"  material.  If  a  negative  result  ("no  defectives")  is  obtained,  no 
further  action  is  taken,  but  if  there  is  a  positive  result,  each  unit  is  tested 
separately. 

Let $p_0$, $p'_0$ denote the probabilities of obtaining correct or incorrect
positive results, respectively, at the first test. As before, p, p' denote the
probabilities of correct or incorrect positive results, respectively, when
units are tested individually; X, N denote the number of defective units and the
total number of units in the population respectively, and Y denotes the actual
number of defective units among the n tested.


The overall probability of obtaining a positive result on the first test
is

$$\{1 - P(Y=0)\}\,p_0 + P(Y=0)\,p'_0 = \Bigl(1 - \frac{(N-X)^{(n)}}{N^{(n)}}\Bigr)p_0 + \frac{(N-X)^{(n)}}{N^{(n)}}\,p'_0. \tag{8}$$

As before, Z will denote the number of units called "defective" as a
result of the test.

When Y = 0, the conditional distribution of Z is binomial with parameters
(n, p') plus "added zeroes" (corresponding to a negative result on the first
test):

$$P(Z=0 \mid Y=0) = 1 - p'_0 + p'_0(1-p')^{n},$$
$$P(Z=z \mid Y=0) = p'_0 \binom{n}{z} p'^{\,z}(1-p')^{n-z} \quad (z = 1, 2, \ldots, n). \tag{9}$$

When Y = y > 0, the conditional distribution of Z is that of the sum of two
independent binomial variables with parameters (y, p), (n-y, p') plus "added
zeroes":

$$P(Z=0 \mid Y=y) = 1 - p_0 + p_0(1-p)^{y}(1-p')^{n-y} \quad (y > 0),$$
$$P(Z=z \mid Y=y) = p_0 \sum_{j} \binom{y}{j} p^{j}(1-p)^{y-j} \binom{n-y}{z-j} p'^{\,z-j}(1-p')^{n-y-z+j} \quad (y > 0;\ z = 1, 2, \ldots, n). \tag{10}$$


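The overall distribution of Z is obtained by compounding (9) and (10) with a
hypergeometric distribution (parameters n, X, N) for Y. The r-th factorial
moment of Z is

$$\mu_{(r)}(Z) = n^{(r)}\Bigl[\frac{p_0}{N^{(r)}} \sum_{j=0}^{r} \binom{r}{j} p^{j} p'^{\,r-j} X^{(j)}(N-X)^{(r-j)} + (p'_0 - p_0)\,p'^{\,r}\,\frac{(N-X)^{(n)}}{N^{(n)}}\Bigr]. \tag{11}$$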
Formula (11) can be obtained by noting that formally the distribution of Z
is a mixture of

(a) Binomial(Y, p) + Binomial(n-Y, p') $\bigwedge_{Y}$ Hypergeometric(n, X, N)
with probability $p_0$,
(b) Binomial(n, p') with "probability" $(p'_0 - p_0)P(Y=0)$, and
(c) 0 with probability $(1-p_0)P(Y>0) + (1-p'_0)P(Y=0)$.

(Note that the "probability" for (b) can be negative; indeed, it is quite
likely that $p'_0 < p_0$.)


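In particular,

$$E[Z] = n(p_0\bar{p} - Pp'), \tag{12.1}$$

where, as before, $\bar{p} = XN^{-1}p + (1 - XN^{-1})p'$;
$P = (p_0 - p'_0)(N-X)^{(n)}/N^{(n)}$, and

$$\mathrm{Var}(Z) = n(n-1)\Bigl[\frac{p_0}{N-1}\bigl\{N\bar{p}^{2} - N^{-1}\bigl(Xp^{2} + (N-X)p'^{2}\bigr)\bigr\} - Pp'^{2}\Bigr] + n(p_0\bar{p} - Pp') - n^{2}(p_0\bar{p} - Pp')^{2}. \tag{12.2}$$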
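The group screening distribution can also be evaluated by enumeration. The sketch below is a minimal illustration under assumed parameter values, not a reproduction of the authors' computations: it forms the distribution of Z from the conditional distributions (9) and (10), weighted by the hypergeometric distribution of Y, and compares the resulting mean and variance with (12.1) and (12.2), taking P = (p_0 - p'_0)(N-X)^{(n)}/N^{(n)}. The names p0 and p01 stand for p_0 and p'_0, and p1 for p'.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, m, q):
    """P(K = k) for K ~ Binomial(m, q)."""
    return comb(m, k) * q**k * (1 - q)**(m - k)

def group_screening_pmf(n, X, N, p, p1, p0, p01):
    """pmf of Z, the number of units called defective under group screening.

    p0  : P(group test positive | at least one defective in the group)
    p01 : P(group test positive | no defective in the group)
    p, p1 : individual-test probabilities of calling a defective / good unit defective
    """
    pmf = [Fraction(0)] * (n + 1)
    for y in range(max(0, n - (N - X)), min(n, X) + 1):
        w = Fraction(comb(X, y) * comb(N - X, n - y), comb(N, n))
        positive = p0 if y > 0 else p01
        pmf[0] += w * (1 - positive)        # "added zeroes": group test negative
        for j in range(y + 1):
            for k in range(n - y + 1):
                pmf[j + k] += (w * positive
                               * binom_pmf(j, y, p) * binom_pmf(k, n - y, p1))
    return pmf

# Illustrative (assumed) parameter values.
n, X, N = 10, 5, 100
p, p1 = Fraction(9, 10), Fraction(1, 20)
p0, p01 = Fraction(4, 5), Fraction(1, 10)

pmf = group_screening_pmf(n, X, N, p, p1, p0, p01)
mean = sum(z * q for z, q in enumerate(pmf))
var = sum(z * z * q for z, q in enumerate(pmf)) - mean**2

pbar = (X * p + (N - X) * p1) / N
P = (p0 - p01) * Fraction(comb(N - X, n), comb(N, n))
mean_121 = n * (p0 * pbar - P * p1)
mu2 = n * (n - 1) * (p0 / (N - 1)
                     * (N * pbar**2 - (X * p**2 + (N - X) * p1**2) / N)
                     - P * p1**2)
var_122 = mu2 + mean_121 - mean_121**2

print(float(mean), float(var))
```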


In general, it would seem that $p_0 > p'_0$ just as p > p', since we would
expect (hope) that the probability of correct decision would exceed that of
incorrect decision. It may well happen that $p_0 < p$ since detection of a
defective may be more difficult with the mixture of material from separate
units. More complicated distributions will be obtained if it is supposed that
$p_0$ depends on the value of Y (the number of defective units). It does not
seem unreasonable to suppose that $p_0$ might increase with Y.